Date: Tue, 2 Aug 2022 04:32:42 +0200
From: "CIKM 2022 Short Paper Track"
To: Hossein Fani
Subject: CIKM2022 Short Paper Track notification for paper 5681
Sender: cikm2022-short@easychair.org
Return-Path: cikm2022-short@easychair.org

Dear Hossein Fani,

I am very pleased to inform you that your paper "Resource: A Benchmark Library for Neural Team Formation" has been accepted for presentation in the CIKM’22 Short Paper track. Attached below are the reviewers' comments, which we hope you will find helpful in preparing the camera-ready version. Instructions for producing the camera-ready copy will be sent to you soon. The camera-ready deadline is *** August 15, 2022 ***. If you have any questions, please let us know.

Thank you for submitting to CIKM’22, and please accept our congratulations on your paper's acceptance.

Sincerely,
Wendy Wang and Latifur Khan

SUBMISSION: 5681
TITLE: Resource: A Benchmark Library for Neural Team Formation

------------------------- METAREVIEW ------------------------

The paper proposes a benchmark library for neural team formation. The reviewers point out some strengths of the proposed library: the resource is openly available, and the paper provides a good tutorial on how to use the library. The reviewers also highlight some weaknesses. Most importantly, the paper does not explain the differences to existing libraries for similar purposes in enough detail. The models supported by the library are limited in several senses: the library supports only a single dense representation, namely Team2Vec, and it is unclear how to include other representations. The library is also limited to a rather small number of algorithms, which could limit its impact. It could be useful to add a discussion of related work and possible use cases. The reviews, especially R3, also point out a number of possible future extensions. Taking the reviews into account, the proposed library is useful for its specific task, even though that task is rather narrow. The recommendation is therefore to accept the paper.

----------------------- REVIEW 1 ---------------------
SUBMISSION: 5681
TITLE: Resource: A Benchmark Library for Neural Team Formation
AUTHORS: Arman Dashti, Karan Saxena, Dhwani Patel and Hossein Fani

----------- Overall evaluation -----------
SCORE: 2 (accept)
----- TEXT:
This paper presents a library for benchmarking neural models for team formation. It is a very interesting paper, well presented, with results and evaluations (which are really appreciated in a resource paper). The resource is openly available on GitHub, under a license that allows its exploitation, and the authors have prepared a demo video showing how to start using the software. The only missing point of the paper (I understand that space is very limited) is additional real-world use cases where this solution can help, to motivate its impact in the mid to long term.

----------- Strengths and reasons to accept -----------
- Well presented
- Experiments and evaluations performed
- Resource available and demo/guideline provided

----------- Weaknesses and limitations -----------
- More real-world use cases to motivate its impact

----------------------- REVIEW 2 ---------------------
SUBMISSION: 5681
TITLE: Resource: A Benchmark Library for Neural Team Formation
AUTHORS: Arman Dashti, Karan Saxena, Dhwani Patel and Hossein Fani

----------- Overall evaluation -----------
SCORE: 1 (weak accept)
----- TEXT:
The paper provides OpeNTF, a Python-based benchmark library for neural team formation.
In my opinion, the introduction of the paper could be more straightforward. Overall, however, the article is well written and mostly easy to understand.

----------- Strengths and reasons to accept -----------
The paper clearly describes the purpose of the work, shows how it works, and presents an evaluation. The OpeNTF library is available on GitHub and can be tested, and the source code is also well documented.

----------- Weaknesses and limitations -----------
Related work might need to be treated separately in the paper; if it were included as its own section, the paper would be sounder and easier to understand. The contents of the introduction and of the contributions subsection partially overlap.

----------------------- REVIEW 3 ---------------------
SUBMISSION: 5681
TITLE: Resource: A Benchmark Library for Neural Team Formation
AUTHORS: Arman Dashti, Karan Saxena, Dhwani Patel and Hossein Fani

----------- Overall evaluation -----------
SCORE: -1 (weak reject)
----- TEXT:
In the paper entitled "Resource: A Benchmark Library for Neural Team Formation", the authors describe OpeNTF, a benchmarking tool for the area of team formation. The tool supports only neural team formation. The paper assumes that skills are attached to specific teams and that experts are members of teams. As an example, the authors mention a scientific paper as a team, the paper's authors as the experts, and the paper's keywords as the skills associated with the team/authors. The code for the benchmarking tool is available on GitHub.

The benchmark library currently contains two baseline implementations of algorithms (a feed-forward neural network (FNN) and the state-of-the-art variational Bayesian neural network (BNN); more dedicated algorithms do not seem to exist for the task) and, additionally, a random assignment option. For both algorithms, two representations of skills and teams are provided: sparse vectors and dense vectors in the form of Team2Vec. For the two baseline implementations, there are three options to generate negative samples: uniform sampling, unigram sampling, and unigram_b sampling; another option is not using negative samples at all. As datasets to run the algorithms, the benchmarking suite offers dblp v12, imdb, and USPT. This leads to 16 variations of the two algorithms plus one random baseline.
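For concreteness, the following is a minimal sketch of how such negative sampling over experts could work. Everything in it is an illustrative assumption rather than OpeNTF's actual API: the function name, the data layout, and in particular the reading of unigram_b as unigram counts taken per minibatch (the paper's own definition should be checked).

    import numpy as np

    def sample_negatives(positives, expert_freq, k, scheme="uniform", rng=None):
        """Draw k expert ids that are NOT members of the current team.

        positives:   set of expert ids that truly belong to the team.
        expert_freq: per-expert occurrence counts; ignored for "uniform",
                     counted over the whole training set for "unigram", and
                     (assumption) counted over the current minibatch only
                     for "unigram_b".
        """
        rng = rng or np.random.default_rng()
        weights = (np.ones(len(expert_freq)) if scheme == "uniform"
                   else np.asarray(expert_freq, dtype=float))
        weights[list(positives)] = 0.0  # never propose a true member as a negative
        return rng.choice(len(weights), size=k, replace=False,
                          p=weights / weights.sum())

    # Example: 10 experts, expert 3 is prolific, the current team is {0, 3}.
    freq = np.array([1, 2, 1, 9, 1, 1, 2, 1, 1, 1])
    print(sample_negatives({0, 3}, freq, k=3, scheme="unigram"))

Under the unigram schemes, prolific experts are proposed as negatives more often, which pushes the model away from always recommending popular experts.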
---
Novelty:
* The resource claims to surpass the existing resource PyTFL in terms of scalability, reproducibility, and extensibility. The scalability aspect is explained by a parallel step in the data pre-processing: the "data" object is constructed in parallel after the raw data is read in and before it is transformed into a sparse representation.
* The advancement seems incremental.

Availability:
* The resource is available online.
* The description fits the code in the repository.
* Licensing seems correct and is open enough for academic usage. For-profit usage is not allowed.
* The data is not collected from humans.

Utility:
* The resource is well documented except for advanced methods (e.g., defining another notion of success); the level of expertise required to use the code is intermediate.
* There is a Jupyter notebook which can serve as an example.
* The benchmark library includes data which can be loaded.
* The pre-processing steps for the data are not described at all.

Predicted Impact:
* The benchmark library is only applicable to the neural team formation task. Approaches cannot use more sophisticated data (not contained in the existing datasets), explainability, or a temporal component.
* The research area does not yet seem well established; it has existed for a few years, but there are very few publications on the topic.
* I do not expect the resource to be useful for a long time. The datasets need to be extended for more complex algorithms, the vector representation methods need to be updated, and the benchmarking system does not support explainability or a temporal component in the approaches, both of which could be included.
* The anticipated research user community is small.

----------- Strengths and reasons to accept -----------
The paper has very good availability and good utility.
* The GitHub repository is well structured, the code is readable, and there is activity in the repository in the form of commits, issues, and discussions.
* The imdb data was fetched in the appropriate/allowed way.
* Licenses seem to be compatible and sufficiently open.
* The benchmark library offers a variety of negative sampling options.
* The paper is written in a straightforward, clear, and understandable way.

----------- Weaknesses and limitations -----------
The benchmarking system has very limited functionality (2 algorithms with 4 negative sampling methods and 2 content representation options), predicted impact, and novelty.
* The authors do not convincingly describe the differences between OpeNTF and PyTFL. Why did they not extend, modify, or collaborate on PyTFL instead of creating a new benchmark library?
* The authors make the strong assumption that every member of a team has the skills associated with the team. This definition may not depict reality in some cases (e.g., not all authors of this paper might be experts in negative sampling methods), but a discussion/identification of this assumption is missing.
* Section 2, page 2: In the authors' definition of the neural team formation task, a team is predicted to be successful depending on "previous successful teams". Yet in the evaluation setup, the time dimension is disregarded in the train-test split of the data. More sophisticated neural team formation models that include a temporal component will therefore not be supported by the benchmark library (see the sketch of a temporal split after this list).
* For the dense representation of vectors, the authors only offer Team2Vec. What other (e.g., BERT-based) options could be offered to represent the information?
* A mention of changes over time in teams or skills, and of potential bias in the datasets, is missing.
* A mention of the explainability of resulting teams is missing. The authors state that the current benchmark system should be used by organisers or practitioners to construct teams by applying one of the provided algorithms. Independent of the area of application (scientific collaboration, movies, patents), potential users should not be expected to blindly trust the system; they require explanations of the results. To be future proof, the authors should strive to incorporate this aspect in their system.
* The authors do not state how the repository will be maintained.
* The authors do not state how and whether other (more extensive or nuanced) datasets are going to be included.
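To make the temporal concern concrete, here is a minimal sketch contrasting a random train-test split with a time-respecting one. The table layout and the "year" column are illustrative assumptions, not OpeNTF's data model.

    import pandas as pd

    # Toy teams table; column names are assumptions, not OpeNTF's schema.
    teams = pd.DataFrame({
        "team_id": [1, 2, 3, 4, 5, 6],
        "year":    [2015, 2016, 2017, 2018, 2019, 2020],
    })

    # Random split (what the evaluation setup reportedly does): future teams
    # can leak into training, which is unrealistic when forecasting new teams.
    random_test = teams.sample(frac=1 / 3, random_state=0)
    random_train = teams.drop(random_test.index)

    # Temporal split: train strictly on the past, test strictly on the future.
    cutoff = 2018
    temporal_train = teams[teams["year"] <= cutoff]
    temporal_test = teams[teams["year"] > cutoff]

    print(sorted(random_train["year"]), "->", sorted(random_test["year"]))
    print(sorted(temporal_train["year"]), "->", sorted(temporal_test["year"]))

Supporting a cutoff-based split like this would be a prerequisite for benchmarking any model with a temporal component.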
Minor points:
* Section 1, page 1: "For scalability, OpeNTF employs parallel execution at the data pre-processing step, and gpu-acceleration for model training, validation, and test which is the standard practice in machine learning research yet overlooked in the task of neural team formation to date." - Why does it help with scalability to parallelize only the construction of the "data" object? Why is this step the bottleneck?
* How/where can a user of the benchmark library define their own notion of success of a team?
* The authors' Team2Vec code copies part of the reference implementation by Rad et al., but the authors do not discuss their changes. E.g., why was the window size changed, and how does the low number of epochs (only 10, compared to 100 in the reference implementation) influence the performance of the model? (A sketch of how such a doc2vec-style team embedding is typically trained, with these two hyperparameters exposed, follows this list.)
* The data pre-processing and datasets are not described.
* The required imdb licensing phrase from their web page is not included in the repository.
* The Team2Vec implementation by Rad et al. was partially copied but is not credited in the code.
* The choice of evaluation measures is not motivated.
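As context for the window/epochs question above, here is a minimal sketch of doc2vec-style team embedding in the spirit of Team2Vec: each team is treated as a "document" whose "words" are its skills. This is a generic gensim illustration, not the authors' or Rad et al.'s code; the toy data is invented and the hyperparameter values are simply the ones under discussion.

    from gensim.models.doc2vec import Doc2Vec, TaggedDocument

    # Each team is a document tagged with its id; its skills are the words.
    # Toy data, not drawn from dblp/imdb/USPT.
    teams = {
        "t1": ["neural_networks", "negative_sampling", "benchmarking"],
        "t2": ["information_retrieval", "benchmarking"],
        "t3": ["neural_networks", "bayesian_inference"],
    }
    corpus = [TaggedDocument(words=skills, tags=[tid])
              for tid, skills in teams.items()]

    # The two hyperparameters the review asks about: `window` (context size)
    # and `epochs` (passes over the corpus); 10 vs. 100 epochs trades training
    # time against embedding quality.
    model = Doc2Vec(vector_size=100, window=2, min_count=1, epochs=10)
    model.build_vocab(corpus)
    model.train(corpus, total_examples=model.corpus_count, epochs=model.epochs)

    # Dense vector for team "t1", usable as input to the FNN/BNN baselines.
    print(model.dv["t1"][:5])

Reporting how embedding quality changes between 10 and 100 epochs on the actual datasets would answer the question directly.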